
Conversation

@DN6 (Collaborator) commented on Aug 25, 2025:

What does this PR do?

Part of the ongoing attention refactor.

We've defined transformer blocks in the attention module; I think they would have a better home under models/transformers in a modeling_common.py file. (A sketch of what the relocation could look like follows.)
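
As an illustration of the kind of move proposed here, a minimal sketch of a back-compat shim that could live in diffusers/models/attention.py, assuming the blocks land in models/transformers/modeling_common.py. BasicTransformerBlock is used as an example block; the PR's actual code and deprecation strategy may differ:

# Hypothetical shim in diffusers/models/attention.py: the class moves to its
# new home under models/transformers, and the old import path keeps working
# via a re-export. Illustrative only, not the PR's actual code.
from .transformers.modeling_common import BasicTransformerBlock  # noqa: F401

__all__ = ["BasicTransformerBlock"]

Existing user code such as `from diffusers.models.attention import BasicTransformerBlock` would then keep working unchanged while the definition lives under models/transformers.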

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@DN6 requested a review from yiyixuxu on September 3, 2025 at 15:42.

The following review comment is anchored on this diff context:
return hidden_states


class LuminaFeedForward(nn.Module):

@yiyixuxu (Collaborator) commented on Sep 9, 2025:


Should we just go a step further and move them into single model files? Or just some of them, with Copied from, maybe? That's what we have been doing with newer models, no?
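
(For context, the Copied from mechanism mentioned above is the repository convention in which a duplicated block carries a marker comment and make fix-copies keeps the copy in sync with its source. A minimal sketch, assuming a per-model file that duplicates a shared FeedForward block; the class body is illustrative, not the actual diffusers implementation:)

from torch import nn


# Copied from diffusers.models.attention.FeedForward
class FeedForward(nn.Module):
    # Illustrative body only. The marker comment above tells `make fix-copies`
    # to keep this duplicated class identical to the referenced source class;
    # edits are made upstream and propagated by the tooling.
    def __init__(self, dim: int, dim_out: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim_out), nn.GELU())

    def forward(self, hidden_states):
        return self.net(hidden_states)

This gives each model a self-contained file while still preventing the copies from drifting apart.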

@DN6 (Collaborator, Author) replied:

Yeah can do 👍🏽

